Towards Code Generation for the Synchronous Control Asynchronous Dataflow (SCAD) Architectures
Abstract
Many recent processor architectures expose their datapaths so that the compiler can not only schedule instructions to increase instruction-level concurrency, but can also move values between the processing units of the processor and thereby optimize their allocation at compile time. In this paper, we introduce the Synchronous Control Asynchronous Dataflow (SCAD) paradigm as another exposed datapath architecture and discuss how code is best generated for SCAD architectures. Traditional compilers focus on register usage and therefore usually evaluate expressions by a depth-first traversal. We show that compiler techniques for SCAD should instead focus on a breadth-first evaluation of expressions, as known from queue machines. This way, values can be forwarded from one processing unit to another without using registers at all. However, queue machines sometimes need additional swap and duplication operations that add overhead to the actual computation. Since a queue machine can be simulated by a universal SCAD machine, we can derive SCAD programs from queue programs. Moreover, if the queue program does not contain swap or duplication operations, the SCAD program is optimal for a single processing unit, where ‘optimal’ refers to the minimal number of swap and duplication operations. However, we also show that SCAD programs can sometimes avoid this overhead even where queue machines need it, which makes SCAD code generation more difficult.
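The breadth-first evaluation mentioned in the abstract can be illustrated with a minimal sketch of a queue machine. This is not the paper's code generator and it does not model SCAD processing units. It only shows the well-known level-order scheme for expression trees in which every value is used exactly once, so no swap or duplication operations are needed. The tuple encoding of expressions and the names `levels`, `queue_program`, and `run` are hypothetical and chosen for illustration.

```python
from collections import deque
import operator

# Hypothetical encoding: a leaf is a number, an inner node is (op, left, right).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def levels(tree):
    """Group the nodes of the tree by depth, left to right within each depth."""
    result, frontier = [], [tree]
    while frontier:
        result.append(frontier)
        frontier = [
            child
            for node in frontier
            if isinstance(node, tuple)
            for child in node[1:]
        ]
    return result


def queue_program(tree):
    """Emit one instruction per node, deepest level first, left to right."""
    program = []
    for level in reversed(levels(tree)):
        for node in level:
            if isinstance(node, tuple):
                program.append(("op", node[0]))
            else:
                program.append(("load", node))
    return program


def run(program):
    """Execute on a single FIFO queue: operands leave at the front,
    results enter at the rear. No registers are involved."""
    queue = deque()
    for kind, arg in program:
        if kind == "load":
            queue.append(arg)
        else:
            left = queue.popleft()
            right = queue.popleft()
            queue.append(OPS[arg](left, right))
    return queue.popleft()
```

For the tree `("*", ("+", 1, 2), ("-", 7, 3))`, this sketch emits the four loads first and then `+`, `-`, `*`, so each operation finds its operands at the front of the queue. Expressions with shared subexpressions fall outside this sketch; they are the case where, as the abstract notes, a queue machine may need swap or duplication operations.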
Similar resources
Compositionality in dataflow synchronous languages: specification & distributed code generation
Modularity is advocated as a solution for the design of large systems, the mathematical translation of this concept is often that of compositionality. This paper is devoted to the issues of compositionality for modular code generation, in dataflow synchronous languages. As careless reuse of object code in new or evolving system designs fails to work, we first concentrate on what are the additio...
System-level Clustering and Timing Analysis for GALS-based Dataflow Architectures
In this paper, we propose an approach based on dataflow techniques for modeling application-specific, globally asynchronous, locally synchronous (GALS) architectures for digital signal processing (DSP) applications, and analyzing the performance of such architectures. Dataflow-based techniques are attractive for DSP applications because they allow application behavior to be represented formally...
Compositionality in Dataflow Synchronous Languages: Specification and Distributed Code Generation
Modularity is advocated as a solution for the design of large systems, the mathematical translation of this concept is often that of compositionality. This paper is devoted to the issues of compositionality for modular code generation, in dataflow synchronous languages. As careless reuse of object code in new or evolving system designs fails to work, we first concentrate on what are the additional...
Eliminating Nondeterminism to Enable Chip-Level Test of Globally-Asynchronous Locally-Synchronous SoC’s
Globally asynchronous locally synchronous (GALS) clocking applied to a system-on-a-chip (SoC) results in a design in which each core is a synchronous block (SB) of logic whose locally generated clock has an independent frequency and phase. Data is exchanged between cores using an asynchronous communication protocol. The nondeterministic synchronization strategies used by most GALS architectures...
Synchro-Tokens: Eliminating Nondeterminism to Enable Chip-Level Test of Globally-Asynchronous Locally-Synchronous SoC’s
Globally asynchronous locally synchronous (GALS) clocking applied to a system-on-a-chip (SoC) results in a design in which each core is a synchronous block (SB) of logic with a locally generated clock. Inter-core communication is asynchronous and controlled by wrapper logic around the cores. The nondeterministic synchronization used by most GALS architectures makes chip-level silicon debug and ...
Publication date: 2016